Let's start!

CiberAMP requires several packages that will be automatically installed:

Gene query

The algorithm requires two mandatory inputs. The first one is a list of gene symbols to analyze. This can range from a single transcript to the whole set of human genes.

genes.of.interest <- c("GENEA", "GENEB", "GENEC", ...)

In this example, we will study which are genes are differentially expressed in association with their SCNVs in the cohort of head-and-neck and lung squamous cell carcinomas (TCGA-HNSC and TCGA-LUSC, respectively). In particular, we will focus our analysis on genes lodge at chromosomes 3, 5 and 8.

Getting genes from specific chromosomal locations

The first step is to create a vector with the symbols of the genes that are located at chromosomes 3, 5 and 8:

library(biomaRt)
library(stringr)
# First, we get all the genes at chromosomes 3, 5 and 8 using the biomaRt R package. If you have the last version of R, the you can set "TRUE" the "useCache" argument in getBM function.

ensembl <- useEnsembl(biomart = "ensembl", dataset = "hsapiens_gene_ensembl", mirror = "useast")
genes <- getBM(attributes=c('chromosome_name', 'band', 'start_position', 'end_position', 'strand', 'hgnc_symbol'), filters = "chromosome_name", values = c("3", "5", "8"), mart=ensembl, useCache = FALSE)
genes <- genes[genes$hgnc_symbol != "", ]
genes <- genes[!duplicated(genes$hgnc_symbol), ]
for(i in 1:nrow(genes)) {
  if(genes$strand[i] < 0) {
    genes$strand[i] <- "-"
  }else{
    genes$strand[i] <- "+"
  }
}

Running CiberAMP

Now, we have (1) the list of gene symbols to query and (2) the TCGA cohorts to analyze (TCGA-HNSC and TCGA-LUSC). CiberAMP also allows you to designate a path to save your results. By default, it will use the current working directory.

library(ciberAMP)
# x <- ciberAMP(genes = reg.genes$hgnc_symbol, cohorts = c("LUSC", "HNSC"), pat.percentage = 0, *writePath = "PATH_TO_FOLDER"*)

However, there are many arguments that allow you to personalize your analysis:

In this example, we will run the analysis with default parameters:

x <- ciberAMP(genes = as.character(genes$hgnc_symbol), cohorts = c("HNSC", "LUSC"))

Looking into CiberAMP results

CiberAMP results into a list of 3 data frames that can be accessed by:

The x[[1]] data frame contains all differentially expressed genes between (1) tumor vs normal and (2) copy number altered vs diploid tumor samples:

#To get the first queried gene (first row) results.
head(x[[1]][1, ])

The x[[2]] data frame containing CGC list differential expression results. Columns are the same as x[[1]].

head(x[[2]][1, ])

Finally, the x[[3]] data frame contains SCNVs co-occurrency between the SCNV-DEGs and known cancer driver genes amplification and deletions:

head(x[[3]][1, ])

Mapping results into chromosomes

From the original 1870 genes we found that:

They are mapped at their chromosomal positions:

library(chromPlot)
df <- x[[1]]
genes_cosmic <- x[[3]]
gr.assoc.hnsc <- gr[gr$hgnc_symbol %in% as.character(df[df$Tumor %in% "HNSC" & abs(df$log2FC.SCNAvsDip) >= 1 & df$FDR.SCNAvsDip <= 0.01, ]$Gene_Symbol), ]
gr.assoc.lusc <- gr[gr$hgnc_symbol %in% as.character(df[df$Tumor %in% "LUSC" & abs(df$log2FC.SCNAvsDip) >= 1 & df$FDR.SCNAvsDip <= 0.01, ]$Gene_Symbol), ]
both <- intersect(gr.assoc.hnsc$hgnc_symbol, gr.assoc.lusc$hgnc_symbol)
chromPlot(gaps=hg_gap, annot1 = gr[gr$hgnc_symbol %in% both, ], chr = c(3,5,8))

To answer if these genes are very close to any SCN-altered CGC in the cohort, we visually plot them (red bars) and the CGC co-occurring ones (yellow bars):

library(chromPlot)
df.cosmic <- x[[2]]
gr.assoc.hnsc.cosmic <- gr[gr$hgnc_symbol %in% as.character(df.cosmic[df.cosmic$Tumor %in% "HNSC" & abs(df.cosmic$log2FC.SCNAvsDip) >= 1 & df.cosmic$FDR.SCNAvsDip <= 0.01, ]$Gene_Symbol), ]
gr.assoc.lusc.cosmic <- gr[gr$hgnc_symbol %in% as.character(df.cosmic[df.cosmic$Tumor %in% "LUSC" & abs(df.cosmic$log2FC.SCNAvsDip) >= 1 & df.cosmic$FDR.SCNAvsDip <= 0.01, ]$Gene_Symbol), ]
both.cosmic <- intersect(gr.assoc.hnsc.cosmic$hgnc_symbol, gr.assoc.lusc.cosmic$hgnc_symbol)
chromPlot(gaps=hg_gap, annot1 = gr[gr$hgnc_symbol %in% both, ], annot2 = gr[gr$hgnc_symbol %in% both.cosmic, ], chr = c(3,5,8), chrSide=c(-1,1,1,1,1,1,1,1))

Ggplotting the results

CiberAMP package provides a ggplot-based function to create a scatter plot where:

In the end, a significantly altered gene can be classified in 8 scenarios:

Let's take a look on those 67 resulting genes:

library(ggplot2)
df.exp <- x[[1]]
df.exp <- df.exp[df.exp$Gene_Symbol %in% both, ]
ggplot.CiberAMP(df.exp)

Interacting with the results

Finally, CiberAMP provides a function to interactively explore your results. In this app, you can filter genes by:

Clicking on any dot will display a table with the overlapping % of samples in which a SCN-deregulated queried genes co-occurs with any CGC oncodrivers' SCNAs.

Tip: Click on the 'submit' button to initialize the graph!! If 'ALL' is written in the box, then every SCN-altered queried gene from your results will be plotted.

library(shiny)
library(plotly)
library(DT)
int.plot.CiberAMP(df.exp, x[[3]])

Session info {.unnumbered}

sessionInfo()


vqf/ciberAMP documentation built on April 12, 2022, 12:45 p.m.